training set - meaning and definition. What is training set
Diclib.com
Online Dictionary

What (who) is training set - definition

THREE DATASETS USED IN MACHINE LEARNING
Training set; Validation set; Dataset (machine learning); Training data set; Training data; Test set; Training, test and validation sets; Training, test, and validation sets; Out-of-sample; Trainable parameter; Trained parameter; Train parameter; Model training; Training, validation, and test sets; Holdout data set; Training, validation, and test datasets; Training, validation, and test data; Training dataset

Training, validation, and test data sets         
In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data.
Set (mathematics)
WELL-DEFINED MATHEMATICAL COLLECTION OF DISTINCT OBJECTS
Set (math); Crisp set; Conventional set; Number sets; Set (mathematical); Mathematical set; Set logic; Basic set operations; Finite subset
A set is the mathematical model for a collection of different things; a set contains elements or members, which can be mathematical objects of any kind: numbers, symbols, points in space, lines, other geometrical shapes, variables, or even other sets. The set with no element is the empty set; a set with a single element is a singleton.
Recruit training
INITIAL INDOCTRINATION AND INSTRUCTION GIVEN TO NEW MILITARY PERSONNEL
Basic training; Military basic training; Combat Basic Training; Basic Combat Training; Basic Military Training; Combat Conditioning; Recruit Training; Basic Training Regiment; Basic military training; Boot-camp; Boot training; Initial training (military); Basic training (military); Initial training; Military preparation; Recruit training
Recruit training, more commonly known as basic training or, colloquially, boot camp, refers to the initial instruction of new military personnel. Recruit training is a physically and psychologically intensive process, which resocializes its subjects for the demands of military employment.

Wikipedia

Training, validation, and test data sets

In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. The input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and test sets.
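The three-way division described above can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are all assumptions chosen for the example.

```python
import random

def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle examples and partition them into training, validation, and test lists.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    n_train = len(shuffled) - n_val - n_test
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical data: 100 examples identified by the integers 0..99.
train, val, test = split_dataset(range(100))
```

Every example lands in exactly one of the three lists, so no example used for fitting is later reused for evaluation.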

The model is initially fit on a training data set, which is a set of examples used to fit the parameters of the model (e.g. the weights of connections between neurons in an artificial neural network). The model (e.g. a naive Bayes classifier) is trained on the training data set using a supervised learning method, for example an optimization method such as gradient descent or stochastic gradient descent. In practice, the training data set often consists of pairs of an input vector (or scalar) and the corresponding output vector (or scalar), where the desired output is commonly called the target (or label). The current model is run on each input vector in the training data set and produces a result, which is then compared with the target. Based on the result of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. The model fitting can include both variable selection and parameter estimation.
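As a rough sketch of the loop just described (run the model, compare the result with the target, adjust the parameters), the example below fits a one-input linear model by batch gradient descent on mean squared error. The function `fit_linear`, the learning rate, the epoch count, and the toy data are assumptions made for illustration only.

```python
def fit_linear(pairs, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b to (input, target) pairs by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in pairs:
            error = (w * x + b) - target  # model result compared with the target
            grad_w += 2 * error * x / n   # derivative of mean squared error w.r.t. w
            grad_b += 2 * error / n       # derivative of mean squared error w.r.t. b
        # Adjust the parameters against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical training pairs generated from y = 2x + 1.
training_pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_linear(training_pairs)
```

On this noise-free toy data the fitted `w` and `b` approach 2 and 1; real training data would not be recovered exactly.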

Subsequently, the fitted model is used to predict the responses for the observations in a second data set called the validation data set. The validation data set provides an unbiased evaluation of a model fit on the training data set while tuning the model's hyperparameters (e.g. the number of hidden layers and the layer widths in a neural network). Validation data sets can be used for regularization by early stopping (stopping training when the error on the validation data set increases, as this is a sign of over-fitting to the training data set). This simple procedure is complicated in practice by the fact that the validation error may fluctuate during training, producing multiple local minima. This complication has led to the creation of many ad hoc rules for deciding when over-fitting has truly begun.
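One common ad hoc rule of this kind is a "patience" counter: stop only after the validation error has failed to improve for a fixed number of consecutive epochs. The sketch below assumes the per-epoch validation errors are already available as a list; the function name `early_stopping_epoch` and the error values are hypothetical.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the index of the best epoch seen before stopping.

    Training is considered stopped once `patience` consecutive epochs have
    passed without a new lowest validation error.
    """
    best_epoch = None
    best_error = float("inf")
    for epoch, error in enumerate(validation_errors):
        if best_epoch is None or error < best_error:
            best_epoch, best_error = epoch, error  # new best model so far
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch

# Hypothetical validation errors that fluctuate, giving several local minima.
errors = [0.90, 0.70, 0.72, 0.60, 0.61, 0.63, 0.62, 0.55]
stop_at = early_stopping_epoch(errors, patience=3)
```

With these numbers a patience of 1 keeps the model from epoch 1, a patience of 3 keeps epoch 3, and only a larger patience reaches the lower error at epoch 7, which illustrates why the choice of rule matters.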

Finally, the test data set is a data set used to provide an unbiased evaluation of a final model fit on the training data set. If the data in the test data set has never been used in training (for example in cross-validation), the test data set is also called a holdout data set. Some literature uses the term "validation set" instead of "test set" (e.g., if the original data set was partitioned into only two subsets, the test set might be referred to as the validation set).
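To make the division of roles concrete, the sketch below selects among candidate models using validation error and then evaluates only the chosen model on the test pairs. The candidate models, the data, and the helper `mean_squared_error` are all hypothetical and exist only for this illustration.

```python
def mean_squared_error(predict, pairs):
    """Average squared difference between predictions and targets."""
    return sum((predict(x) - y) ** 2 for x, y in pairs) / len(pairs)

# Hypothetical candidates, standing in for models fit with different hyperparameters.
candidates = {
    "slope_1": lambda x: 1.0 * x,
    "slope_2": lambda x: 2.0 * x,
    "slope_3": lambda x: 3.0 * x,
}

# Hypothetical held-out data; neither set was used to fit the candidates.
validation_pairs = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
test_pairs = [(4.0, 8.1), (5.0, 9.8)]

# Model selection uses the validation set only.
best_name = min(
    candidates,
    key=lambda name: mean_squared_error(candidates[name], validation_pairs),
)

# The test set is consulted once, for the final model.
test_error = mean_squared_error(candidates[best_name], test_pairs)
```

Because the test pairs play no part in choosing `best_name`, `test_error` is not biased by the selection step.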

The sizes of the training, validation, and test sets, and the strategy for dividing the data among them, depend heavily on the problem and the data available.

Pronunciation examples for training set
1. They generalize, also, well beyond the training set.
The Plastic Tide _ Peter Kohler & Stefan Leutenegger _ Talks at Google
2. that a lot of their tumors in their training set
Janelle Shane _ You Look Like a Thing and I Love You _ Talks at Google
3. and that was used as a training set in order
New Galaxy Formation Insights _ Joel Primack _ Talks at Google
4. And so building a training set of zorillas
The Crowd and the Cosmos - Adventures in the Zooniverse _ Chris Lintott _ Talks at Google
5. to get enough to be a useful training set
New Galaxy Formation Insights _ Joel Primack _ Talks at Google
Examples of use of training set
1. "The police's entire training set-up can continue operating under the human resources department, instead of appointing another senior commander to head it," he said.